Investigating the Relationship between High Voltage Bias Settings and the Instrument Resolution of a Positron Lifetime Spectrometer

Author

Addison Ballif

Introduction

Positron annihilation spectroscopy is a non-destructive way to measure material properties. A positron is the antimatter counterpart of the electron. When a positron comes in contact with an electron, the two annihilate, producing two gamma photons of 511 keV each. In this experiment we use a Na22 source to generate positrons; the source also emits a 1274 keV photon in coincidence with each positron. Positron lifetime spectroscopy is a technique in positron annihilation spectroscopy in which the timing difference between the 1274 keV photon and one of the 511 keV photons is measured, giving the lifetime of the positron. Defect density is of particular interest in the field of positron annihilation spectroscopy: as the defect density of a material increases, the average positron lifetime also increases, so measuring the average positron lifetime is a way to measure defect density.

In this technique, the two photons are detected using two scintillation crystals, each attached to a photomultiplier tube. When the gamma photons reach the scintillators, the crystals interact with the photons in a process called fluorescence, producing a small packet of visible light photons, which the photomultiplier tubes convert to an electric signal.

Photomultiplier tubes (PMTs) measure small numbers of visible light photons via the photoelectric effect. A photomultiplier tube contains a series of small plates with a large potential difference across them. When a freed electron strikes a plate, it knocks off additional electrons, and these move on to the next plate, knocking off still more. As the group of electrons travels from plate to plate, it grows. The size of the potential difference affects how easy it is for electrons to be knocked off of each plate and travel through the tube. Thus the potential difference (or bias) affects both the magnitude of the signal and the time it takes the signal to travel through the PMT.
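The multiplicative nature of this process can be illustrated with a simple gain sketch: if each of N plates multiplies the electron group by a secondary-emission factor δ, the overall gain is δ^N, and δ grows with the voltage per stage. The numbers below are illustrative example values, not the specifications of our tubes.

```python
# Illustrative PMT gain model: each plate (dynode) multiplies the electron
# group by a factor delta, so N stages give delta**N overall.
# delta and n_stages here are hypothetical example values.

def pmt_gain(delta: float, n_stages: int) -> float:
    """Overall electron multiplication after n_stages plates."""
    return delta ** n_stages

# A modest per-stage factor compounds into a very large overall gain:
print(pmt_gain(4, 10))  # 4**10 = 1048576 electrons per initial electron

# Raising the bias raises delta slightly at every stage, so the overall
# signal amplitude changes dramatically:
print(pmt_gain(4.5, 10) / pmt_gain(4.0, 10))  # roughly 3.2x larger signal
```

This compounding is why even modest changes to the bias can noticeably change both the signal magnitude and the transit time through the tube.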

A Photo Multiplier Tube. Image Source: Qwerty123uiop; https://commons.wikimedia.org/wiki/File:PhotoMultiplierTubeAndScintillator.svg.

In this experiment we examine the relationship between the biases for each PMT and the resolution of the positron lifetime spectrometer. The instrument response can be measured directly by placing a Co60 sample in front of the detector. Co60 emits two nearly coincident photons when it decays. Measuring the distribution of timing differences for this near 0 timing difference sample allows us to directly measure the resolution of the spectrometer.

This semester our instrument resolution was measured using this method. We expect the instrument resolution to be a single Gaussian peak. However, the measured distribution of timing differences had extra bumps in the signal. The figure below shows the extra bumps we observed.

Our Measured Instrument Resolution

Such bumps are known to occur when the signal coming from the PMT is so large that clipping occurs internally in the instrument (TechnoAP, 2020). Despite this odd measured distribution, a normal distribution of timing differences is preferred. One of the response variables we examine in this experiment is the normality of the timing distribution.

Since the bias on each PMT also affects how easily photons can be detected, the count rate for the instrument (or the rate at which measurements can be taken) is also affected. The relationship between the bias on each tube and the count rate is another variable of interest in this experiment.

As the potential difference changes across a PMT (by the bias), the time it takes the electron group to move through the PMT changes. That time can be different for each PMT. Thus, the bias settings can affect the average timing difference between the two detectors. Average timing difference is also a variable that we will examine.

Methods

Code Libraries Used In This Experiment
# R Code Libraries
library(reticulate)
library(plotly)
library(dplyr)
library(pander)
# Python Code Libraries
import os
from numpy import loadtxt, array, empty
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from numpy.fft import rfft, irfft
from scipy import stats

For this experiment we collected 52 samples from a Co60 source using the lifetime spectrometer. The start PMT had a barium fluoride (BaF2) scintillator and the stop PMT had a fast plastic scintillator. Each sample ran for about 15 minutes, though some samples had to run longer because access to the instrument was limited. An adjustment was made to address this issue by normalizing each of the measured distributions, which will be discussed later on.

For more information on how the instrument was set up, see chapters 1 and 2 of our report.

Code for importing the data and calculating response variables

Below is the python code used to import each histogram and obtain the response variables of interest from each measured histogram.

# Load all the data files into an array
filenames = [f for f in os.listdir('BiasRegressionProject/') if f.endswith('.txt')]

# lists of all the variables
allHists = [] #don't use this one in the data frame
allStartScintillator = []
allStopScintillator = []
allStartBias = []
allStopBias = []
allLocation = []
allCountRate = []
allRMS_QQscore = []
allSTD = []

# for each histogram in the folder
for i, filename in enumerate(filenames):
  fields = filename.replace('.txt', '').split('-')
  histdata = loadtxt('BiasRegressionProject/' + filename).astype(float)
  
  histdata = (histdata/max(histdata)*100000) #normalize to 'hide' background
  histdata = histdata.astype(int) # set to int
  
  # smooth the data by zeroing high-frequency FFT components
  c = rfft(histdata)
  c[(len(c)//7):] = 0
  smoothed = irfft(c, len(histdata))  # pass the length so odd-length histograms round-trip correctly

  
  allHists.append(histdata)
  allStartScintillator.append(fields[0])
  allStopScintillator.append(fields[1])
  allStartBias.append(int(fields[2]))
  allStopBias.append(int(fields[3]))
  
  # use the max of the smoothed curve to find the location of the main peak
  allLocation.append(int(np.argmax(smoothed)))
  
  # get count rate from the .n42 files. 
  with open('BiasRegressionProject/' + filename.replace('.txt', '.n42'), 'r') as file:
    for line in file:
      if '<RealRate>' in line:
        countRate = float(line.strip().replace('<RealRate>', '').replace('</RealRate>', ''))
        allCountRate.append(countRate)
        break
      
  # recreate the data from the histogram
  all_counts = np.repeat(np.arange(1, len(histdata) + 1), histdata)
  
  # Find the RMS QQ score
  qq_object = stats.probplot(all_counts, dist="norm")
  theoretical_quantiles, sample_quantiles = qq_object[0]
  b1, b0, r = qq_object[1]
  fitted_values = b0 + b1*theoretical_quantiles
  rmsQQ = np.sqrt(np.mean((sample_quantiles - fitted_values)**2))
  allRMS_QQscore.append(rmsQQ)
  
  # Find standard deviations
  allSTD.append(np.std(all_counts))
  
# store all lists in a dictionary
d = {'StartBias':allStartBias, 'StopBias':allStopBias, 'MeanTimeDifference':(array(allLocation) - 1108), 'CountRate':allCountRate, 'RMSQQScore':allRMS_QQscore, 'STD':allSTD}

# create the data frame
fulldf = pd.DataFrame(data=d)

Each of the 52 samples had a different combination of bias settings for the start and stop PMTs. After each sample was taken, the histogram was saved, and the three response variables of interest were calculated from it. The scatter plot below shows all the bias combinations that were sampled.

The mean timing difference between the two tubes was calculated by taking the location of the main central peak and subtracting the channel at which coincident electric pulses are recorded. The channel to subtract was measured using a pulse generator to simulate two simultaneous pulses. The location of the main central peak was found by taking the maximum of a smoothed version of the histogram, created by applying the fast Fourier transform and removing the high-frequency components of the signal.

In addition to the normally distributed component of measured timing differences, there is also usually a constant background throughout the entire distribution. This background component is rarely seen in a 15 minute sample period, but it becomes noticeable in longer samples. Because some of the samples were recorded for longer periods, some include a significant background component. To account for this, all samples were normalized by scaling the highest point on the histogram to 100000 counts and then using a floor function to cast each scaled bin height back to an integer. When this was done, the background present in the longer samples was scaled to a value less than 1 and was then removed by the floor function. Normalizing all the distributions this way removed the background from the longer samples without significantly changing the shape of the distribution of interest.
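The scale-then-floor step can be sketched with a small synthetic example (made-up numbers, assuming the background is far below one hundred-thousandth of the peak height, as in our longer samples):

```python
import numpy as np

# Synthetic histogram: a tall Gaussian peak plus a constant background of
# 3 counts per bin (illustrative values, not real data).
bins = np.arange(2048)
peak = 5e7 * np.exp(-0.5 * ((bins - 1024) / 40.0) ** 2)
hist = peak + 3.0

# Normalize so the highest bin is 100000 counts, then floor to integers.
scaled = hist / hist.max() * 100000
floored = scaled.astype(int)  # truncation acts as a floor for positive values

# The background scales to 3/5e7 * 1e5 = 0.006 < 1, so it floors to 0,
# while the peak region survives essentially unchanged.
print(floored[0])     # background bin: 0 (removed)
print(floored[1024])  # peak bin: 100000 (preserved)
```

The same logic applies to the real histograms: any constant background small relative to the peak is driven below 1 by the scaling and eliminated by the cast to integer.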

Code for normalization and calculating mean timing difference

Below is the python code for normalizing each curve, as well as for calculating the mean timing difference between the two PMTs based on the histogram.

# HOW I GOT THE CENTER POINTS
ypoints = allHists[14].astype(float)
ypoints_notnormal = ypoints.astype(int)
ypoints = (ypoints/max(ypoints)*1000) # scale the histogram
ypoints = ypoints.astype(int) # cast to int using floor function

# smooth the function to find the main peak
c = rfft(ypoints)
c[(len(c)//7):] = 0
smoothed = irfft(c, len(ypoints))

# calculate the timing difference
timingdiff = int(np.argmax(smoothed)) - 1108

The figure below demonstrates the normalization and finding the mean timing difference between the two PMTs.

The mean timing difference is 61 channels.

The count rate for each sample was calculated by the measurement software.

The normality rating was created by making a quantile-quantile (QQ) plot for each sample. A simple linear regression was then performed on the QQ plot, and the sum of the squared differences between the fitted line and the points on the QQ plot was used as a measure of how non-normal the sample was. That sum was divided by the number of occurrences in each sample and then passed through a square root transformation, so this measure is essentially the residual standard error of the simple linear regression. A large residual standard error suggests that the distribution of measured timing differences was not normal, while a residual standard error of 0 would suggest that the timing differences were perfectly normally distributed. We call the measure created this way the RMS QQ Score.

Code for calculating normality score (RMS QQ Score)

Below is the python code demonstrating the calculation of the RMS QQ Score, which is used as a measure of the non-normality of the distribution.

# CALCULATING THE RMS QQ SCORE
ypoints = allHists[3]

# recreate the data from the histogram
data = np.repeat(np.arange(1, len(ypoints) + 1), ypoints.astype(int)) 

# Get the theoretical and sample quantiles
qq_object = stats.probplot(data, dist="norm")
theoretical_quantiles, sample_quantiles = qq_object[0]

# calculate fitted values from linear regression results
b1, b0, r = qq_object[1]
fitted_values = b0 + b1*theoretical_quantiles

# Find the RMS QQ score
rmsQQ = np.sqrt(np.mean((sample_quantiles - fitted_values)**2))

# Plot the QQ plot
plt.clf()
plt.plot(theoretical_quantiles, sample_quantiles, 'o', label='Sample Quantiles')
plt.axline((0, b0), slope=b1, color='orange', label='Simple Linear Regression')
plt.title('QQ Plot')
plt.xlabel('Theoretical Quantiles')
plt.ylabel('Sample Quantiles')
plt.legend()
plt.show()

print(f'The RMS QQ score (or residual standard error) for this histogram is {rmsQQ:.2f}')
The RMS QQ score (or residual standard error) for this histogram is 36.42

Results

The regression results for each response variable are discussed in the sections below.

SPLOM Plot

Below is the SPLOM Plot showing the relationships between each of the variables in the data set.

One interesting pattern visible in the SPLOM plot is the very linear relationship between the RMS QQ Score and the standard deviation of each measured distribution. One possible reason for this is that if the linear fit of each QQ plot had a slope near 0, the residual standard error would equal the standard deviation of the data. Despite the tight relationship between these two variables, the RMS QQ Score is the measure we used to quantify the normality of the timing differences in each sample.
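The limiting case mentioned above can be checked directly: the RMS residual around a flat line through the sample mean is, by definition, the (population) standard deviation of the data. A small synthetic check (made-up data, not one of our histograms):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.sort(rng.normal(50, 12, size=1000))  # sorted data = sample quantiles

# RMS residual around a horizontal line at the mean reduces exactly to
# the population standard deviation of the data.
rms_flat = np.sqrt(np.mean((x - x.mean()) ** 2))
print(np.isclose(rms_flat, np.std(x)))  # True
```

In practice the fitted QQ slope is not zero, so the two quantities are not identical, but this identity helps explain why they track each other so closely.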

Normality vs Bias

The 3d scatter plot below shows the measured relationship between the RMS QQ Score and high voltage bias on each tube.

The mathematical model we used to fit this data is given as \[ \widehat{\text{RMSQQScore}} = \beta_0 + \beta_1 \text{StartBias} + \beta_2 \text{StopBias} + \beta_3 \text{StartBias}^2 + \beta_4 \text{StopBias}^2.\]

The regression results using this model to fit the data shown above are given in the table below.

Code
qqscore.lm <- lm(RMSQQScore ~ StartBias + StopBias + I(StartBias^2) + I(StopBias^2), py$fulldf)
qqscore.b <- coef(qqscore.lm)
summary(qqscore.lm) %>% pander()
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 358.8 46.14 7.775 5.522e-10
StartBias -0.06817 0.02917 -2.337 0.02374
StopBias -0.308 0.03518 -8.755 1.952e-11
I(StartBias^2) 2.088e-05 6.351e-06 3.288 0.001917
I(StopBias^2) 8.89e-05 8.411e-06 10.57 5.195e-14
Fitting linear model: RMSQQScore ~ StartBias + StopBias + I(StartBias^2) + I(StopBias^2)
Observations Residual Std. Error \(R^2\) Adjusted \(R^2\)
52 8.48 0.9438 0.939
Full Regression Summary

Call:
lm(formula = RMSQQScore ~ StartBias + StopBias + I(StartBias^2) + 
    I(StopBias^2), data = py$fulldf)

Residuals:
     Min       1Q   Median       3Q      Max 
-28.3441  -4.6534  -0.3523   6.1902  14.6659 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)     3.588e+02  4.614e+01   7.775 5.52e-10 ***
StartBias      -6.817e-02  2.917e-02  -2.337  0.02374 *  
StopBias       -3.080e-01  3.518e-02  -8.755 1.95e-11 ***
I(StartBias^2)  2.088e-05  6.351e-06   3.288  0.00192 ** 
I(StopBias^2)   8.890e-05  8.411e-06  10.570 5.20e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.48 on 47 degrees of freedom
Multiple R-squared:  0.9438,    Adjusted R-squared:  0.939 
F-statistic: 197.2 on 4 and 47 DF,  p-value: < 2.2e-16

With an adjusted \(r^2\) of 0.939, we felt comfortable with this model’s ability to describe the data. The model uses only the two explanatory variables, which keeps it fairly simple, and without a physical explanation there is no meaningful way to improve it. Below is a surface plot showing how well this model fits the observed data.

Because the RMS QQ Score is a somewhat arbitrary measure of normality, interpreting the coefficients in the model does not have much meaning. However, one useful feature of the model is that it can predict the settings that minimize the RMS QQ Score, and hopefully maximize normality. The table below shows the settings that minimize the RMS QQ Score.

StartBias StopBias
1633 1732

This model predicts a minimum RMS QQ Score (suggesting the highest normality) at a start PMT bias of 1633 V and a stop PMT bias of 1732 V. The sampled bias combination closest to this minimum uses a start PMT bias of 1600 V and a stop PMT bias of 1800 V. The measured distribution for these settings is shown below.
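Because the fitted surface is quadratic with no interaction term, the minimizing biases follow directly from the coefficients in the regression table (−β1/2β3 and −β2/2β4). The sketch below uses the rounded coefficients from the table, so it reproduces the tabulated minimum to within a volt or two:

```python
# Minimum of a quadratic surface with no interaction term:
# d/dx (b1*x + b3*x^2) = 0  =>  x = -b1 / (2*b3), and likewise for y.
# Coefficients are the rounded values from the regression table above.
b1, b3 = -0.06817, 2.088e-05   # StartBias, StartBias^2
b2, b4 = -0.308,   8.89e-05    # StopBias,  StopBias^2

start_min = -b1 / (2 * b3)   # roughly 1632-1633 V
stop_min  = -b2 / (2 * b4)   # roughly 1732 V
print(start_min, stop_min)
```

The small discrepancy from the tabulated 1633 V comes from coefficient rounding; the full-precision fit gives the values in the table.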

As can be seen from the plot above, this sample, which is near the minimum RMS QQ Score predicted by our model, clearly has problems with normality. This suggests that the RMS QQ Score method of measuring normality does not adequately capture the presence or absence of these extra bumps.

Bias vs Count Rate

The 3d scatter plot showing the relationship between count rate and bias settings in the data is shown below.

The mathematical model we used to fit this data is given by \[ \widehat{\text{CountRate}} = \beta_0 + \beta_1 \text{StartBias} + \beta_2 \text{StopBias} + \beta_3 \text{StartBias}^2 + \beta_4 \text{StopBias}^2 + \beta_5 \text{StartBias} \cdot \text{StopBias}.\]

The regression results for this model are given below.

Code
countrate.lm <- lm(CountRate ~ StartBias + StopBias + I(StartBias^2) + I(StopBias^2) + StartBias:StopBias, py$fulldf)
summary(countrate.lm) %>% pander()
  Estimate Std. Error t value Pr(>|t|)
(Intercept) -1080 77.16 -13.99 3.454e-18
StartBias 0.6585 0.04649 14.17 2.166e-18
StopBias 0.3942 0.05481 7.191 4.711e-09
I(StartBias^2) -8.466e-05 9.429e-06 -8.979 1.115e-11
I(StopBias^2) -6.594e-05 1.25e-05 -5.274 3.482e-06
StartBias:StopBias -7.78e-05 8.792e-06 -8.849 1.714e-11
Fitting linear model: CountRate ~ StartBias + StopBias + I(StartBias^2) + I(StopBias^2) + StartBias:StopBias
Observations Residual Std. Error \(R^2\) Adjusted \(R^2\)
52 12.59 0.9632 0.9592
Full regression summary

Call:
lm(formula = CountRate ~ StartBias + StopBias + I(StartBias^2) + 
    I(StopBias^2) + StartBias:StopBias, data = py$fulldf)

Residuals:
    Min      1Q  Median      3Q     Max 
-20.063  -7.582  -0.739   6.935  44.861 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)        -1.080e+03  7.716e+01 -13.991  < 2e-16 ***
StartBias           6.585e-01  4.649e-02  14.166  < 2e-16 ***
StopBias            3.942e-01  5.481e-02   7.191 4.71e-09 ***
I(StartBias^2)     -8.466e-05  9.429e-06  -8.979 1.12e-11 ***
I(StopBias^2)      -6.594e-05  1.250e-05  -5.274 3.48e-06 ***
StartBias:StopBias -7.780e-05  8.792e-06  -8.849 1.71e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 12.59 on 46 degrees of freedom
Multiple R-squared:  0.9632,    Adjusted R-squared:  0.9592 
F-statistic: 240.9 on 5 and 46 DF,  p-value: < 2.2e-16

This model is simple enough that with an adjusted \(r^2\) of 0.959, we feel comfortable with this model’s ability to describe the general pattern happening in the data.

Visualizing the fitted model

Below is a surface plot showing how well this model fits the observed data.

The quadratic terms make it difficult to interpret how changing the bias on either PMT by a given voltage will change the count rate, but the model can predict a count rate for the detector given the bias settings. Note that the actual count rate will differ depending on the CFD filters applied when using a Na22 sample, as well as the placement of the sample and the source. It is still useful, however, to consider which bias settings give the maximum count rate.

The table below shows the bias settings that this model predicts will yield the highest count rate.

StartBias StopBias
3451 953
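With the interaction term present, the stationary point comes from solving the two-equation gradient system rather than two independent one-dimensional optimizations. A sketch using the rounded coefficients from the table above:

```python
import numpy as np

# Gradient of b1*x + b2*y + b3*x^2 + b4*y^2 + b5*x*y set to zero:
#   2*b3*x + b5*y = -b1
#   b5*x + 2*b4*y = -b2
# Coefficients are the rounded values from the regression table above.
b1, b2 = 0.6585, 0.3942
b3, b4, b5 = -8.466e-05, -6.594e-05, -7.78e-05

A = np.array([[2 * b3, b5],
              [b5, 2 * b4]])
rhs = np.array([-b1, -b2])
start_max, stop_max = np.linalg.solve(A, rhs)
print(round(start_max), round(stop_max))  # approximately 3451 and 953
```

Since both quadratic coefficients are negative (and the Hessian is negative definite), this stationary point is a maximum of the fitted surface.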

Unfortunately, this maximum is not within the range of bias settings reasonable for the PMTs. Though a high count rate is desired, having a normal distribution of timing differences is important. Looking at the 3d scatter plot of the measured data, the sample that has the highest count rate while still achieving a low RMS QQ Score has a start PMT bias of 2450 V and a stop PMT bias of 1750 V.

In general, a higher start bias and lower stop bias will yield the best count rate.

Mean Timing Difference vs Bias

The mean timing difference is the average delay between the PMTs for a coincident pair of photons. The 3d scatter plot showing the relationship between mean timing difference and bias settings in the data is shown below.

The mathematical model we used to fit this data is given by \[ \widehat{\text{MeanTimeDiff}} = \beta_0 + \beta_1 \text{StartBias} + \beta_2 \text{StopBias}. \]

The regression results for this model are given below.

Code
timediff.lm <- lm(MeanTimeDifference ~ StartBias + StopBias, py$fulldf)
summary(timediff.lm) %>% pander()
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 84.35 16.85 5.007 7.56e-06
StartBias 0.2219 0.005643 39.32 1.003e-38
StopBias -0.2562 0.006282 -40.79 1.765e-39
Fitting linear model: MeanTimeDifference ~ StartBias + StopBias
Observations Residual Std. Error \(R^2\) Adjusted \(R^2\)
52 18.99 0.9829 0.9822
Full regression summary

Call:
lm(formula = MeanTimeDifference ~ StartBias + StopBias, data = py$fulldf)

Residuals:
    Min      1Q  Median      3Q     Max 
-37.935 -13.061  -0.468  12.704  33.711 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 84.349040  16.847246   5.007 7.56e-06 ***
StartBias    0.221905   0.005643  39.323  < 2e-16 ***
StopBias    -0.256231   0.006282 -40.785  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 18.99 on 49 degrees of freedom
Multiple R-squared:  0.9829,    Adjusted R-squared:  0.9822 
F-statistic:  1409 on 2 and 49 DF,  p-value: < 2.2e-16

This model is simple enough that with an adjusted \(r^2\) of 0.982, we feel that the model adequately describes the relationship in the data. The plot below shows how the model fits the data.

As the start PMT bias increases by 100 volts, the mean timing difference between the tubes (or the measured timing difference between the photons) increases by 22 channels. For the calibration of the detector we used in Spring 2024, this corresponds to a timing difference of 375 picoseconds. This makes sense because as the start PMT bias increases, that PMT responds faster, so the stop signal lags longer relative to the start signal.

As the stop PMT bias increases by 100 volts, the mean timing difference between the tubes (or the measured timing difference between the photons) decreases by 26 channels. For the calibration of the detector we used in Spring 2024, this corresponds to a timing difference of 433 picoseconds. This makes sense because as the stop PMT bias increases, the stop PMT responds faster and catches up relative to the start PMT.
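The channel-to-picosecond conversion behind these figures can be sketched as follows. The calibration factor of roughly 16.9 ps/channel is inferred here from the numbers quoted above (22 channels ≈ 375 ps); it is an assumption for illustration, not a value stated in this section.

```python
# Converting the fitted slope coefficients from channels to picoseconds.
# PS_PER_CHANNEL is inferred from the quoted figures (22 ch ~ 375 ps),
# not taken from an official calibration record.
PS_PER_CHANNEL = 16.9

start_slope = 0.221905   # channels per volt of start PMT bias
stop_slope = -0.256231   # channels per volt of stop PMT bias

print(start_slope * 100 * PS_PER_CHANNEL)  # ~375 ps per 100 V on the start PMT
print(stop_slope * 100 * PS_PER_CHANNEL)   # ~-433 ps per 100 V on the stop PMT
```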

Though it is not necessary for the mean timing difference to be 0, our model suggests that to achieve a mean timing difference of 0, the relationship between the stop and start bias would need to follow \[ \text{StopBias} = 329.2 \text{ Volts}+ 0.866 \cdot \text{StartBias}.\]

This formula gives the StartBias and StopBias in positive voltages, even though the actual bias is negative. For a start PMT bias of 2400 V, the stop PMT bias would have to be 2408 V. For a start PMT bias of 1600 V, the stop PMT bias would have to be 1714 V. For a start PMT bias of 2800 V, the stop PMT bias would have to be 2754 V. This may be useful to know, but the mean timing difference is not usually critical in the analysis of positron lifetime spectroscopy data.
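The zero-timing-difference line follows from setting the fitted model to zero and solving for the stop bias. A short sketch using the full-precision coefficients from the regression summary (small differences from the worked values above are rounding):

```python
# Set 84.349040 + 0.221905*StartBias - 0.256231*StopBias = 0 and solve
# for StopBias; coefficients are from the regression summary above.
B0, B_START, B_STOP = 84.349040, 0.221905, -0.256231

def stop_for_zero_diff(start_bias: float) -> float:
    """Stop PMT bias (V) giving a predicted mean timing difference of 0."""
    return (B0 + B_START * start_bias) / -B_STOP

for start in (1600, 2400, 2800):
    print(start, stop_for_zero_diff(start))
```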

Discussion

This experiment demonstrated an interesting relationship between the mean timing difference between the two tubes and the bias on each tube. Recognizing this pattern, it is useful to have a way to predict what the mean timing difference will be. Currently, the popular methods for analyzing positron lifetime spectroscopy data do not require the mean timing difference to be known, but many curve fitting methods do require a “time zero” value, and the patterns observed in this experiment could be used to produce a more confident initial guess for the “time zero” parameter.

Understanding the relationship between count rate and the bias on each PMT is useful for maximizing the speed at which data can be measured. Although the \(r^2\) was quite good and the model describes the general pattern, there are places where the model undershoots the data and others where it overestimates slightly. Finding a more meaningful theoretical explanation, including a description of the movement of the electrons inside the PMT, could aid in improving this model.

The RMS QQ Score did exhibit a pattern similar to what we expected. However, when plotting a sample near the minimum RMS QQ Score predicted by the model, the plotted sample still contained significant bumps. This suggests that the RMS QQ Score did not do an excellent job of quantifying the presence or absence of extra bumps in the signal. A new analysis of the same data using a better method for quantifying the extra bumps would be a useful place for future work. One possible approach would be to identify the extra peaks and compare the ratio of their heights to the main peak. This would also allow separate regressions for peaks on the left and on the right, which would improve interpretability.
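One way the proposed peak-ratio measure could be sketched is with `scipy.signal.find_peaks`; the synthetic two-bump histogram, the prominence threshold, and the ratio definition below are our illustrative choices, not a method from this report:

```python
import numpy as np
from scipy.signal import find_peaks

# Synthetic histogram: a main Gaussian peak plus a smaller bump to its
# right (illustrative shapes only, not measured data).
ch = np.arange(2000)
hist = (1000 * np.exp(-0.5 * ((ch - 1000) / 30) ** 2)
        + 200 * np.exp(-0.5 * ((ch - 1300) / 30) ** 2))

# Locate peaks; the prominence threshold screens out noise-level wiggles.
peaks, _ = find_peaks(hist, prominence=50)
heights = hist[peaks]

# Express each secondary peak as a fraction of the main peak's height.
main = heights.max()
ratios = sorted(h / main for h in heights if h < main)
print(len(peaks), ratios)  # expect 2 peaks, with a ratio near 0.2
```

A regression of such peak ratios on the bias settings would directly target the bumps rather than overall non-normality.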

From longer instrument resolution samples that have been taken, it is sometimes possible to see smaller extra peaks that are not visible in a 15 minute sample. The background signal also has an interesting shape. If worth the time and effort (and with a better way to measure the presence of extra peaks), it could be interesting to perform this analysis again, collecting each sample over a longer period of time.

Considering the extra peaks, this study suggests that lowering the bias on both PMTs does increase the normality of the measured distribution, which is connected with decreasing the extra peaks. The TechnoAP manual suggests that one possible cause for the extra peaks is that the signal clips when entering the discriminators if its amplitude is too large, which is related to the PMT bias (TechnoAP, 2020). Some in the field have implemented an attenuator between the PMT and the discriminator on each channel. TechnoAP mentions that this would decrease the precision of the instrument (TechnoAP, 2020), though that may not be as big of an issue if it removes the extra bumps. Wieland also used attenuators when examining the relationship between the PMT bias on each channel and the resolution FWHM (Wieland). Repeating the experiment, or performing other measurements, with an attenuator to decrease the signal amplitude before the discriminators could be a useful thing to try. One possible advantage of the attenuators, if the clipping hypothesis is correct, is that they would allow choosing biases that increase count rate without hurting the normality of the signal.

Summary of possible future work

  • Perform the same analysis on the same data with an improved method of quantifying the presence or absence of extra bumps.
  • Find a physically meaningful model for count rate to improve the count rate vs PMT bias model.
  • Perform the same analysis collecting longer samples to investigate background shape and other extra bumps.
  • Implement an attenuator between the PMT and discriminator as a means to remove extra bumps from the signal.